Question Answering (QA) is a growing area of research, often used to facilitate the extraction of information from within documents. State-of-the-art QA models are usually pre-trained on domain-general corpora like Wikipedia and thus tend to struggle on out-of-domain documents without fine-tuning. We demonstrate that synthetic domain-specific datasets can be generated easily using domain-general models, while still providing significant improvements to QA performance. We present two new tools for this task: a flexible pipeline for validating the synthetic QA data and training downstream models on it, and an online interface to facilitate human annotation of this generated data. Using this interface, crowdworkers labelled 1117 synthetic QA pairs, which we then used to fine-tune downstream models and improve domain-specific QA performance by 8.75 F1.
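As a rough illustration of the F1 metric quoted above, the sketch below computes a token-overlap F1 between a predicted and a reference answer. It assumes a SQuAD-style definition with simple lowercasing and whitespace tokenisation; the abstract does not say which F1 variant or normalisation was used, and the function name `token_f1` is hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: lowercased whitespace tokens, as in SQuAD-style scoring;
    the evaluation behind the reported 8.75 F1 gain may normalise differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1("the Eiffel Tower", "Eiffel Tower")
print(f"{score:.2f}")  # precision 2/3, recall 1 -> 0.80
```

A dataset-level score would typically average this value over all question-answer pairs and report it as a percentage.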
The need for AI systems to provide explanations for their behaviour is now widely recognised as key to their adoption. In this paper, we examine the problem of trustworthy AI and explore what delivering this means in practice, with a focus on healthcare applications. Work in this area typically treats trustworthy AI as a problem of Human-Computer Interaction involving the individual user and an AI system. However, we argue here that this overlooks the important part played by organisational accountability in how people reason about and trust AI in socio-technical settings. To illustrate the importance of organisational accountability, we present findings from ethnographic studies of breast cancer screening and cancer treatment planning in multidisciplinary team meetings to show how participants made themselves accountable both to each other and to the organisations of which they are members. We use these findings to enrich existing understandings of the requirements for trustworthy AI and to outline some candidate solutions to the problems of making AI accountable both to individual users and organisationally. We conclude by outlining the implications of this for future work on the development of trustworthy AI, including ways in which our proposed solutions may be re-used in different application settings.
Opinion summarisation synthesises opinions expressed in a group of documents discussing the same topic to produce a single summary. Recent work has looked at opinion summarisation of clusters of social media posts. Such posts are noisy and have unpredictable structure, posing additional challenges for the construction of the summary distribution and the preservation of meaning compared to online reviews, which have so far been the focus of opinion summarisation. To address these challenges we present WassOS, an unsupervised abstractive summarisation model which makes use of the Wasserstein distance. A Variational Autoencoder is used to get the distribution of documents/posts, and the distributions are disentangled into separate semantic and syntactic spaces. The summary distribution is obtained using the Wasserstein barycenter of the semantic and syntactic distributions. A latent variable sampled from the summary distribution is fed into a GRU decoder with a transformer layer to produce the final summary. Our experiments on multiple datasets including Twitter clusters, Reddit threads, and reviews show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries with respect to meaning preservation according to human evaluations.
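The barycenter step described above can be illustrated with a minimal sketch. It assumes that each post's latent distribution is a diagonal Gaussian (a common VAE choice, not confirmed by the abstract) and that all posts are weighted equally. Under that assumption the 2-Wasserstein barycenter has a closed form: its mean is the weighted average of the means and its standard deviation is the weighted average of the standard deviations, per dimension. The function name and the toy values are hypothetical, and this is not the WassOS implementation.

```python
def gaussian_w2_barycenter(means, stds, weights=None):
    """2-Wasserstein barycenter of diagonal Gaussians (closed form).

    Assumption: every input is N(mean, diag(std**2)); this is an
    illustrative simplification, not the model described in the paper.
    """
    n = len(means)
    if weights is None:
        weights = [1.0 / n] * n  # equal weight for every post
    dim = len(means[0])
    # Per dimension: average the means, and average the standard
    # deviations (not the variances).
    bary_mean = [sum(w * m[d] for w, m in zip(weights, means)) for d in range(dim)]
    bary_std = [sum(w * s[d] for w, s in zip(weights, stds)) for d in range(dim)]
    return bary_mean, bary_std


# Two toy 2-dimensional posteriors standing in for two posts.
post_means = [[0.0, 2.0], [2.0, 4.0]]
post_stds = [[1.0, 0.5], [3.0, 1.5]]
summary_mean, summary_std = gaussian_w2_barycenter(post_means, post_stds)
print(summary_mean, summary_std)  # [1.0, 3.0] [2.0, 1.0]
```

Averaging standard deviations rather than variances is what distinguishes the Wasserstein barycenter from a simple moment-matched mixture in this Gaussian case.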
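We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarising news articles, following a template that separates factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favours extractive summarisation models. To showcase the dataset's utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarisation models and achieve good performance, with the former outperforming the latter. We also show that fine-tuning is necessary to improve performance, and we investigate the benefits of using different sample sizes.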
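Today's conflicts are becoming increasingly complex, fluid and fragmented, often involving a host of national and international actors with multiple and often divergent interests. This development poses significant challenges for conflict mediation, as mediators struggle to make sense of conflict dynamics, such as the range of conflict parties and the evolution of their political positions, the distinction between relevant and less relevant actors in peace-making, or the identification of key conflict issues and their interdependence. International peace efforts appear ill-equipped to successfully address these challenges. While technology is already being experimented with and used in a range of conflict-related fields, such as conflict prediction or information gathering, less attention has been given to how technology can contribute to conflict mediation. This case study contributes to emerging research on the use of state-of-the-art machine learning technologies and techniques in conflict mediation processes. Using dialogue transcripts from peace negotiations in Yemen, this study shows how machine learning can effectively support mediation teams by providing them with tools for knowledge management, extraction and conflict analysis. Apart from illustrating the potential of machine learning tools in conflict mediation, the article also emphasises the importance of an interdisciplinary and participatory co-creation methodology for developing context-sensitive and targeted tools and for ensuring meaningful and responsible implementation.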
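Building models to detect vaccine attitudes on social media is challenging because of the composite, often intricate aspects involved and the limited availability of annotated data. Existing approaches have relied heavily on supervised training that requires abundant annotations and pre-defined aspect categories. Instead, with the aim of leveraging the large amount of unannotated data now available on vaccination, we propose a novel semi-supervised approach for vaccine attitude detection, called VADet. A variational autoencoding architecture based on language models is employed to learn the topical information of the domain from unlabelled data. The model is then fine-tuned with a few manually annotated examples of user attitudes. We validate the effectiveness of VADet on our annotated data and also on an existing vaccination corpus annotated with opinions on vaccines. Our results show that VADet is able to learn disentangled stance and aspect topics, and that it outperforms existing aspect-based sentiment analysis models on both stance detection and tweet clustering.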
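In recent years, participatory budgeting (PB) in Scotland has grown from a handful of community-led processes to a movement supported by local and national government. This is reflected in an agreement between the Scottish Government and the Convention of Scottish Local Authorities (COSLA) that at least 1% of local authority budgets will be subject to PB. This ongoing research paper explores the challenges that emerge from this "scaling up" or "mainstreaming" across Scotland's 32 local authorities. The main objective is to evaluate local authority use of the digital platform Consul, which applies Natural Language Processing (NLP) to address these challenges. The project adopts a qualitative longitudinal design with interviews, observations of PB processes, and analysis of the digital platform data. Thematic analysis is employed to capture the major issues and themes that emerge. Longitudinal analysis then explores how these evolve over time. The potential for 32 live study sites provides a unique opportunity to explore discrete political and social contexts as they vary, allowing a deeper dive into the challenges and issues that may exist, which a wider cross-sectional study would miss. Initial results show that the issues and challenges arising from scaling up may be tackled using NLP technology which, in a previous controlled use-case-based evaluation, has been shown to improve the effectiveness of citizen participation.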
Crop type maps are critical for tracking agricultural land use and estimating crop production. Remote sensing has proven an efficient and reliable tool for creating these maps in regions with abundant ground labels for model training, yet these labels remain difficult to obtain in many regions and years. NASA's Global Ecosystem Dynamics Investigation (GEDI) spaceborne lidar instrument, originally designed for forest monitoring, has shown promise for distinguishing tall and short crops. In the current study, we leverage GEDI to develop wall-to-wall maps of short vs tall crops on a global scale at 10 m resolution for 2019-2021. Specifically, we show that (1) GEDI returns can reliably be classified into tall and short crops after removing shots with extreme view angles or topographic slope, (2) the frequency of tall crops over time can be used to identify months when tall crops are at their peak height, and (3) GEDI shots in these months can then be used to train random forest models that use Sentinel-2 time series to accurately predict short vs. tall crops. Independent reference data from around the world are then used to evaluate these GEDI-S2 maps. We find that GEDI-S2 performed nearly as well as models trained on thousands of local reference training points, with accuracies of at least 87% and often above 90% throughout the Americas, Europe, and East Asia. Systematic underestimation of tall crop area was observed in regions where crops frequently exhibit low biomass, namely Africa and South Asia, and further work is needed in these systems. Although the GEDI-S2 approach only differentiates tall from short crops, in many landscapes this distinction goes a long way toward mapping the main individual crop types. The combination of GEDI and Sentinel-2 thus presents a very promising path towards global crop mapping with minimal reliance on ground data.
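Step (2) above, finding the months in which tall crops are at their peak, can be sketched as a simple frequency count. The example assumes each GEDI shot has already been classified as tall or short and tagged with its acquisition month. The input layout, the function name `peak_tall_months` and the relative threshold of 0.9 are all hypothetical choices made for illustration; the abstract does not give the selection rule actually used.

```python
from collections import defaultdict


def peak_tall_months(shots, rel_threshold=0.9):
    """Return the months whose tall-crop fraction is close to the maximum.

    shots: iterable of (month, is_tall) pairs, one per classified shot.
    rel_threshold: hypothetical cut-off; a month is kept if its tall
    fraction is at least rel_threshold times the best month's fraction.
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [tall shots, all shots]
    for month, is_tall in shots:
        counts[month][0] += int(is_tall)
        counts[month][1] += 1
    if not counts:
        return []
    tall_fraction = {m: tall / total for m, (tall, total) in counts.items()}
    best = max(tall_fraction.values())
    if best == 0:
        return []  # no tall crops observed in any month
    return sorted(m for m, f in tall_fraction.items() if f >= rel_threshold * best)


# Toy data: month 7 is all tall, month 8 is 3/4 tall, month 6 is half tall.
shots = [
    (6, False), (6, True),
    (7, True), (7, True),
    (8, True), (8, True), (8, False), (8, True),
    (10, False), (10, False),
]
print(peak_tall_months(shots))       # [7]
print(peak_tall_months(shots, 0.7))  # [7, 8]
```

Shots from the selected months would then serve as training labels for the Sentinel-2 classifier described in step (3).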
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.
This paper presents a 1-D convolutional graph neural network for fault detection in microgrids. The combination of 1-D convolutional neural networks (1D-CNN) and graph convolutional networks (GCN) helps extract both spatial and temporal correlations from the voltage measurements in microgrids. The fault detection scheme includes fault event detection, fault type and phase classification, and fault location. Five neural network models are trained to handle these tasks. Transfer learning and fine-tuning are applied to reduce training effort. The combined recurrent graph convolutional neural network (1D-CGCN) is compared with the traditional ANN structure on the Potsdam 13-bus microgrid dataset. It achieves accuracies of 99.27%, 98.1%, 98.75%, and 95.6% for fault detection, fault type classification, fault phase identification, and fault location, respectively.